Overview

Dataset statistics

Number of variables: 13
Number of observations: 2969
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 301.7 KiB
Average record size in memory: 104.0 B

Variable types

NUM: 13

Warnings

qty_items is highly correlated with gross_revenue [High correlation]
gross_revenue is highly correlated with qty_items [High correlation]
qty_prod_returns is highly correlated with avg_ticket and 1 other field [High correlation]
avg_ticket is highly correlated with qty_prod_returns and 1 other field [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other field [High correlation]
avg_ticket is highly skewed (γ1 = 53.44422362) [Skewed]
frequency is highly skewed (γ1 = 24.88049136) [Skewed]
qty_prod_returns is highly skewed (γ1 = 50.89499407) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.67271661) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
avg_ticket has unique values [Unique]
recency_days has 34 (1.1%) zeros [Zeros]
qty_prod_returns has 1480 (49.8%) zeros [Zeros]
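
The skewness and zeros warnings can be approximated with a few lines of plain Python. This is only a sketch: the moment-based skewness formula, the `SKEW_THRESHOLD` value, the rule that any zero triggers a warning, and the helper names are all assumptions made for illustration, and the profiling tool may use a bias-corrected estimator and different cut-offs.

```python
import math

# Assumed cut-off for illustration; the report does not state the threshold it used.
SKEW_THRESHOLD = 20.0


def skewness(values):
    """Moment-based sample skewness: m3 / m2 ** 1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / math.pow(m2, 1.5)


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def warnings_for(name, values):
    """Return warning strings in the style of the list above (hypothetical helper)."""
    found = []
    g1 = skewness(values)
    if abs(g1) > SKEW_THRESHOLD:
        found.append(f"{name} is highly skewed (γ1 = {g1:.4f})")
    zeros = sum(1 for v in values if v == 0)
    if zeros:  # simplification: any zero at all is reported
        share = 100 * zeros / len(values)
        found.append(f"{name} has {zeros} ({share:.1f}%) zeros")
    return found


# Toy data: one large value among many zeros gives a long right tail.
toy = [0] * 999 + [1000]
for line in warnings_for("toy", toy):
    print(line)
```

The zero percentages in the list follow the same arithmetic: 1480 of 2969 rows is about 49.8%, and 34 of 2969 is about 1.1%.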

Reproduction

Analysis started: 2021-06-08 19:19:15.287242
Analysis finished: 2021-06-08 19:19:40.295106
Duration: 25.01 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2317.277198
Minimum: 0
Maximum: 5715
Zeros: 1
Zeros (%): < 0.1%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 185.4
Q1: 929
Median: 2120
Q3: 3537
95-th percentile: 5035.2
Maximum: 5715
Range: 5715
Interquartile range (IQR): 2608

Descriptive statistics

Standard deviation: 1554.964441
Coefficient of variation (CV): 0.67103083
Kurtosis: -1.010787266
Mean: 2317.277198
Median Absolute Deviation (MAD): 1271
Skewness: 0.3422499487
Sum: 6879996
Variance: 2417914.414
Monotonicity: Strictly increasing
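
Several of the figures in each variable block are derived from the others. As a quick sanity check, the sketch below recomputes the range, IQR, coefficient of variation, variance and mean from the df_index values reported above. The variable names are illustrative only, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the df_index block above.
minimum, maximum = 0, 5715
q1, q3 = 929, 3537
mean, std = 2317.277198, 1554.964441
total, n_rows = 6879996, 2969

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared
mean_from_sum = total / n_rows    # Mean = Sum / number of observations

print(value_range, iqr)
print(round(cv, 8), round(variance, 3), round(mean_from_sum, 6))
```

The same relations hold for every variable block in the report, so they are a cheap way to spot a misread figure.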
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
609 | 1 | < 0.1%
599 | 1 | < 0.1%
2648 | 1 | < 0.1%
601 | 1 | < 0.1%
603 | 1 | < 0.1%
5144 | 1 | < 0.1%
605 | 1 | < 0.1%
2654 | 1 | < 0.1%
607 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%

Value | Count | Frequency (%)
5715 | 1 | < 0.1%
5696 | 1 | < 0.1%
5686 | 1 | < 0.1%
5680 | 1 | < 0.1%
5659 | 1 | < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.77299
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12619.4
Q1: 13799
Median: 15221
Q3: 16768
95-th percentile: 17964.6
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969

Descriptive statistics

Standard deviation: 1718.990292
Coefficient of variation (CV): 0.1125673398
Kurtosis: -1.206094692
Mean: 15270.77299
Median Absolute Deviation (MAD): 1488
Skewness: 0.03160785866
Sum: 45338925
Variance: 2954927.624
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
16384 | 1 | < 0.1%
18164 | 1 | < 0.1%
12933 | 1 | < 0.1%
12935 | 1 | < 0.1%
14984 | 1 | < 0.1%
17033 | 1 | < 0.1%
13704 | 1 | < 0.1%
12939 | 1 | < 0.1%
17037 | 1 | < 0.1%
14125 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18277 | 1 | < 0.1%
18276 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2963
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2749.321711
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.77
Q1: 570.96
Median: 1086.92
Q3: 2308.06
95-th percentile: 7219.68
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1737.1

Descriptive statistics

Standard deviation: 10580.62331
Coefficient of variation (CV): 3.848448607
Kurtosis: 353.944724
Mean: 2749.321711
Median Absolute Deviation (MAD): 672.16
Skewness: 16.77755612
Sum: 8162736.16
Variance: 111949589.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
379.65 | 2 | 0.1%
533.33 | 2 | 0.1%
745.06 | 2 | 0.1%
734.94 | 2 | 0.1%
731.9 | 2 | 0.1%
331 | 2 | 0.1%
719.78 | 1 | < 0.1%
13375.87 | 1 | < 0.1%
447.64 | 1 | < 0.1%
567.36 | 1 | < 0.1%
Other values (2953) | 2953 | 99.5%

Value | Count | Frequency (%)
6.2 | 1 | < 0.1%
13.3 | 1 | < 0.1%
15 | 1 | < 0.1%
36.56 | 1 | < 0.1%
45 | 1 | < 0.1%

Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
168472.5 | 1 | < 0.1%
140450.72 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.28763894
Minimum: 0
Maximum: 373
Zeros: 34
Zeros (%): 1.1%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
Median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.75677911
Coefficient of variation (CV): 1.209513686
Kurtosis: 2.777962659
Mean: 64.28763894
Median Absolute Deviation (MAD): 26
Skewness: 1.798379538
Sum: 190870
Variance: 6046.116697
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
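Value | Count | Frequency (%)
1 | 99 | 3.3%
4 | 87 | 2.9%
2 | 85 | 2.9%
3 | 85 | 2.9%
8 | 76 | 2.6%
10 | 67 | 2.3%
9 | 66 | 2.2%
7 | 66 | 2.2%
17 | 64 | 2.2%
16 | 55 | 1.9%
Other values (262) | 2219 | 74.7%

Value | Count | Frequency (%)
0 | 34 | 1.1%
1 | 99 | 3.3%
2 | 85 | 2.9%
3 | 85 | 2.9%
4 | 87 | 2.9%

Value | Count | Frequency (%)
373 | 2 | 0.1%
372 | 4 | 0.1%
371 | 1 | < 0.1%
368 | 1 | < 0.1%
366 | 4 | 0.1%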

qty_baskets
Real number (ℝ≥0)

Distinct: 56
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.723139104
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.85653132
Coefficient of variation (CV): 1.547495379
Kurtosis: 190.8344494
Mean: 5.723139104
Median Absolute Deviation (MAD): 2
Skewness: 10.76680458
Sum: 16992
Variance: 78.43814702
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
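Value | Count | Frequency (%)
2 | 785 | 26.4%
3 | 499 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%
1 | 190 | 6.4%
6 | 173 | 5.8%
7 | 138 | 4.6%
8 | 98 | 3.3%
9 | 69 | 2.3%
10 | 55 | 1.9%
Other values (46) | 332 | 11.2%

Value | Count | Frequency (%)
1 | 190 | 6.4%
2 | 785 | 26.4%
3 | 499 | 16.8%
4 | 393 | 13.2%
5 | 237 | 8.0%

Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%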

qty_items
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1671
Distinct (%): 56.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1608.852476
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 102.4
Q1: 296
Median: 641
Q3: 1401
95-th percentile: 4407.4
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1105

Descriptive statistics

Standard deviation: 5887.578045
Coefficient of variation (CV): 3.659489067
Kurtosis: 465.998084
Mean: 1608.852476
Median Absolute Deviation (MAD): 422
Skewness: 17.85859125
Sum: 4776683
Variance: 34663575.24
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
310 | 11 | 0.4%
88 | 9 | 0.3%
150 | 9 | 0.3%
288 | 8 | 0.3%
272 | 8 | 0.3%
84 | 8 | 0.3%
246 | 8 | 0.3%
260 | 8 | 0.3%
493 | 7 | 0.2%
134 | 7 | 0.2%
Other values (1661) | 2886 | 97.2%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | 0.1%
12 | 2 | 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%

Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80997 | 1 | < 0.1%
80263 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%

qty_stockcode
Real number (ℝ≥0)

Distinct: 468
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.7241495
Minimum: 1
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
Median: 67
Q3: 135
95-th percentile: 382
Maximum: 7838
Range: 7837
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.8964081
Coefficient of variation (CV): 2.199211884
Kurtosis: 354.8611303
Mean: 122.7241495
Median Absolute Deviation (MAD): 44
Skewness: 15.70763473
Sum: 364368
Variance: 72844.07112
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
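Value | Count | Frequency (%)
28 | 43 | 1.4%
20 | 37 | 1.2%
35 | 35 | 1.2%
29 | 35 | 1.2%
19 | 34 | 1.1%
15 | 33 | 1.1%
11 | 32 | 1.1%
26 | 31 | 1.0%
27 | 30 | 1.0%
25 | 30 | 1.0%
Other values (458) | 2629 | 88.5%

Value | Count | Frequency (%)
1 | 6 | 0.2%
2 | 14 | 0.5%
3 | 16 | 0.5%
4 | 17 | 0.6%
5 | 26 | 0.9%

Value | Count | Frequency (%)
7838 | 1 | < 0.1%
5673 | 1 | < 0.1%
5095 | 1 | < 0.1%
4580 | 1 | < 0.1%
2698 | 1 | < 0.1%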

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
UNIQUE

Distinct: 2969
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.89776151
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.916661099
Q1: 13.11933333
Median: 17.95658654
Q3: 24.98828571
95-th percentile: 90.497
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 11.86895238

Descriptive statistics

Standard deviation: 1036.934407
Coefficient of variation (CV): 19.98033011
Kurtosis: 2890.707126
Mean: 51.89776151
Median Absolute Deviation (MAD): 5.984842033
Skewness: 53.44422362
Sum: 154084.4539
Variance: 1075232.964
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
17.49275862 | 1 | < 0.1%
15.41363636 | 1 | < 0.1%
18.15061538 | 1 | < 0.1%
17.94344444 | 1 | < 0.1%
43.2192 | 1 | < 0.1%
33.53571429 | 1 | < 0.1%
9.418292683 | 1 | < 0.1%
19.55767045 | 1 | < 0.1%
132.0738983 | 1 | < 0.1%
16.80722222 | 1 | < 0.1%
Other values (2959) | 2959 | 99.7%

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%

Value | Count | Frequency (%)
56157.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
3202.92 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%

avg_recency_days
Real number (ℝ≥0)

Distinct: 1257
Distinct (%): 42.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.34805267
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 25.92307692
Median: 48.28571429
Q3: 85.33333333
95-th percentile: 201
Maximum: 366
Range: 365
Interquartile range (IQR): 59.41025641

Descriptive statistics

Standard deviation: 63.54523638
Coefficient of variation (CV): 0.9435348739
Kurtosis: 4.887024667
Mean: 67.34805267
Median Absolute Deviation (MAD): 26.28571429
Skewness: 2.062752894
Sum: 199956.3684
Variance: 4037.997067
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
14 | 25 | 0.8%
4 | 22 | 0.7%
70 | 21 | 0.7%
7 | 20 | 0.7%
35 | 19 | 0.6%
49 | 18 | 0.6%
21 | 17 | 0.6%
11 | 17 | 0.6%
46 | 17 | 0.6%
5 | 16 | 0.5%
Other values (1247) | 2777 | 93.5%

Value | Count | Frequency (%)
1 | 16 | 0.5%
1.5 | 1 | < 0.1%
2 | 13 | 0.4%
2.5 | 1 | < 0.1%
2.601398601 | 1 | < 0.1%

Value | Count | Frequency (%)
366 | 1 | < 0.1%
365 | 1 | < 0.1%
363 | 1 | < 0.1%
362 | 1 | < 0.1%
357 | 2 | 0.1%

frequency
Real number (ℝ≥0)

SKEWED

Distinct: 1225
Distinct (%): 41.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1137973039
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008894164194
Q1: 0.01633986928
Median: 0.02588996764
Q3: 0.04945054945
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.03311068017

Descriptive statistics

Standard deviation: 0.4081562524
Coefficient of variation (CV): 3.586695275
Kurtosis: 989.3650758
Mean: 0.1137973039
Median Absolute Deviation (MAD): 0.0121913375
Skewness: 24.88049136
Sum: 337.8641954
Variance: 0.1665915263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 198 | 6.7%
0.0625 | 18 | 0.6%
0.02777777778 | 17 | 0.6%
0.02380952381 | 16 | 0.5%
0.08333333333 | 15 | 0.5%
0.09090909091 | 15 | 0.5%
0.03448275862 | 14 | 0.5%
0.02941176471 | 14 | 0.5%
0.03571428571 | 13 | 0.4%
0.02564102564 | 13 | 0.4%
Other values (1215) | 2636 | 88.8%

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%

Value | Count | Frequency (%)
17 | 1 | < 0.1%
3 | 1 | < 0.1%
2 | 6 | 0.2%
1.142857143 | 1 | < 0.1%
1 | 198 | 6.7%

qty_prod_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 215
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.30953183
Minimum: 0
Maximum: 80995
Zeros: 1480
Zeros (%): 49.8%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 9
95-th percentile: 102.2
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1522.090875
Coefficient of variation (CV): 23.30579982
Kurtosis: 2696.42626
Mean: 65.30953183
Median Absolute Deviation (MAD): 1
Skewness: 50.89499407
Sum: 193904
Variance: 2316760.633
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1480 | 49.8%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%
6 | 78 | 2.6%
5 | 61 | 2.1%
12 | 51 | 1.7%
8 | 43 | 1.4%
7 | 43 | 1.4%
Other values (205) | 707 | 23.8%

Value | Count | Frequency (%)
0 | 1480 | 49.8%
1 | 164 | 5.5%
2 | 148 | 5.0%
3 | 105 | 3.5%
4 | 89 | 3.0%

Value | Count | Frequency (%)
80995 | 1 | < 0.1%
9360 | 1 | < 0.1%
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1979
Distinct (%): 66.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 249.8137641
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.25
Median: 172.3333333
Q3: 281.6923077
95-th percentile: 600
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.4423077

Descriptive statistics

Standard deviation: 791.5551894
Coefficient of variation (CV): 3.168581172
Kurtosis: 2255.538236
Mean: 249.8137641
Median Absolute Deviation (MAD): 83.08333333
Skewness: 44.67271661
Sum: 741697.0657
Variance: 626559.6179
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 11 | 0.4%
114 | 10 | 0.3%
82 | 9 | 0.3%
73 | 9 | 0.3%
86 | 9 | 0.3%
136 | 8 | 0.3%
75 | 8 | 0.3%
88 | 8 | 0.3%
60 | 8 | 0.3%
163 | 7 | 0.2%
Other values (1969) | 2882 | 97.1%

Value | Count | Frequency (%)
1 | 2 | 0.1%
2 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%

Value | Count | Frequency (%)
40498.5 | 1 | < 0.1%
6009.333333 | 1 | < 0.1%
4282 | 1 | < 0.1%
3906 | 1 | < 0.1%
3868.65 | 1 | < 0.1%

avg_unique_products
Real number (ℝ≥0)

Distinct: 1005
Distinct (%): 33.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.1547082
Minimum: 1
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.345454545
Q1: 10
Median: 17.2
Q3: 27.75
95-th percentile: 56.94
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 19.51232207
Coefficient of variation (CV): 0.8807302672
Kurtosis: 27.70329723
Mean: 22.1547082
Median Absolute Deviation (MAD): 8.2
Skewness: 3.499455899
Sum: 65777.32865
Variance: 380.7307127
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13 | 53 | 1.8%
14 | 39 | 1.3%
11 | 38 | 1.3%
20 | 33 | 1.1%
9 | 33 | 1.1%
1 | 32 | 1.1%
17 | 31 | 1.0%
10 | 30 | 1.0%
18 | 30 | 1.0%
16 | 29 | 1.0%
Other values (995) | 2621 | 88.3%

Value | Count | Frequency (%)
1 | 32 | 1.1%
1.2 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.333333333 | 2 | 0.1%
1.5 | 8 | 0.3%

Value | Count | Frequency (%)
299.7058824 | 1 | < 0.1%
259 | 1 | < 0.1%
203.5 | 1 | < 0.1%
148 | 1 | < 0.1%
145 | 1 | < 0.1%

Interactions

[Figures: interaction plots for each pair of the 13 variables (Matplotlib v3.3.4)]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear relationship, the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
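
The calculation described above can be sketched in plain Python as follows. The function name `pearson_r` is illustrative and is not taken from the report or from any library.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors cancel between numerator and denominator, so they are omitted.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (dev_x * dev_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(pearson_r(xs, ys))
# Shifting and rescaling one variable leaves r unchanged (up to rounding).
print(pearson_r([10 * x + 3 for x in xs], ys))
```

The second call illustrates the invariance under changes in location and scale mentioned above.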

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
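
A minimal sketch of this rank-based calculation, assuming tied values share the average of the ranks they span (a common convention, but an assumption here). The helper names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson's r; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For the cubic toy data, ρ reaches 1 because the ordering is perfectly preserved, while r stays below 1 because the relationship is not linear.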

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
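
The pair-counting definition above can be sketched directly. This is the form without a tie correction (often called tau-a); library implementations commonly apply a tie correction, so their results can differ when the data contain ties. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie; it counts toward neither total
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

This version compares every pair of observations, so its cost grows quadratically with the number of rows, which is acceptable for a sketch but slow for large tables.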

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

[Figures: two missing-value plots (Matplotlib v3.3.4)]

Sample

First rows

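(index) | df_index | customer_id | gross_revenue | recency_days | qty_baskets | qty_items | qty_stockcode | avg_ticket | avg_recency_days | frequency | qty_prod_returns | avg_basket_size | avg_unique_products
0 | 0 | 17850 | 5,391.21 | 372.00 | 34.00 | 1,733.00 | 297.00 | 18.15 | 35.50 | 17.00 | 40.00 | 50.97 | 8.74
1 | 1 | 13047 | 3,232.59 | 56.00 | 9.00 | 1,390.00 | 171.00 | 18.90 | 27.25 | 0.03 | 35.00 | 154.44 | 19.00
2 | 2 | 12583 | 6,705.38 | 2.00 | 15.00 | 5,028.00 | 232.00 | 28.90 | 23.19 | 0.04 | 50.00 | 335.20 | 15.47
3 | 3 | 13748 | 948.25 | 95.00 | 5.00 | 439.00 | 28.00 | 33.87 | 92.67 | 0.02 | 0.00 | 87.80 | 5.60
4 | 4 | 15100 | 876.00 | 333.00 | 3.00 | 80.00 | 3.00 | 292.00 | 8.60 | 0.07 | 22.00 | 26.67 | 1.00
5 | 5 | 15291 | 4,623.30 | 25.00 | 14.00 | 2,102.00 | 102.00 | 45.33 | 23.20 | 0.04 | 29.00 | 150.14 | 7.29
6 | 6 | 14688 | 5,630.87 | 7.00 | 21.00 | 3,621.00 | 327.00 | 17.22 | 18.30 | 0.06 | 399.00 | 172.43 | 15.57
7 | 7 | 17809 | 5,411.91 | 16.00 | 12.00 | 2,057.00 | 61.00 | 88.72 | 35.70 | 0.03 | 41.00 | 171.42 | 5.08
8 | 8 | 15311 | 60,767.90 | 0.00 | 91.00 | 38,194.00 | 2,379.00 | 25.54 | 4.14 | 0.24 | 474.00 | 419.71 | 26.14
9 | 9 | 16098 | 2,005.63 | 87.00 | 7.00 | 613.00 | 67.00 | 29.93 | 47.67 | 0.02 | 0.00 | 87.57 | 9.57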

Last rows

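(index) | df_index | customer_id | gross_revenue | recency_days | qty_baskets | qty_items | qty_stockcode | avg_ticket | avg_recency_days | frequency | qty_prod_returns | avg_basket_size | avg_unique_products
2959 | 5627 | 17727 | 1,060.25 | 15.00 | 1.00 | 645.00 | 66.00 | 16.06 | 6.00 | 1.00 | 6.00 | 645.00 | 66.00
2960 | 5637 | 17232 | 421.52 | 2.00 | 2.00 | 203.00 | 36.00 | 11.71 | 12.00 | 0.15 | 0.00 | 101.50 | 18.00
2961 | 5638 | 17468 | 137.00 | 10.00 | 2.00 | 116.00 | 5.00 | 27.40 | 4.00 | 0.40 | 0.00 | 58.00 | 2.50
2962 | 5649 | 13596 | 697.04 | 5.00 | 2.00 | 406.00 | 166.00 | 4.20 | 7.00 | 0.25 | 0.00 | 203.00 | 83.00
2963 | 5655 | 14893 | 1,237.85 | 9.00 | 2.00 | 799.00 | 73.00 | 16.96 | 2.00 | 0.67 | 0.00 | 399.50 | 36.50
2964 | 5659 | 12479 | 473.20 | 11.00 | 1.00 | 382.00 | 30.00 | 15.77 | 4.00 | 1.00 | 34.00 | 382.00 | 30.00
2965 | 5680 | 14126 | 706.13 | 7.00 | 3.00 | 508.00 | 15.00 | 47.08 | 3.00 | 0.75 | 50.00 | 169.33 | 5.00
2966 | 5686 | 13521 | 1,092.39 | 1.00 | 3.00 | 733.00 | 435.00 | 2.51 | 4.50 | 0.30 | 0.00 | 244.33 | 145.00
2967 | 5696 | 15060 | 301.84 | 8.00 | 4.00 | 262.00 | 120.00 | 2.52 | 1.00 | 2.00 | 0.00 | 65.50 | 30.00
2968 | 5715 | 12558 | 269.96 | 7.00 | 1.00 | 196.00 | 11.00 | 24.54 | 6.00 | 1.00 | 196.00 | 196.00 | 11.00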